Improving Statistical Machine Translation with Processing Shallow Parsing
نویسندگان
چکیده
Reordering is of essential importance for phrase based statistical machine translation (SMT). In this paper, we would like to present a new method of reordering in phrase based SMT. We inspired from (Xia and McCord, 2004) using preprocessing reordering approaches. We used shallow parsing and transformation rules to reorder the source sentence. The experiment results from English-Vietnamese pair showed that our approach achieves significant improvements over MOSES which is the state-of-the art phrase based system.
منابع مشابه
برچسبزنی خودکار نقشهای معنایی در جملات فارسی به کمک درختهای وابستگی
Automatic identification of words with semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching correct semantic roles to them, may lead to improvement in many natural language processing tasks including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...
متن کاملBilingually-Constrained Recursive Neural Networks with Syntactic Constraints for Hierarchical Translation Model
Hierarchical phrase-based translation models have advanced statistical machine translation (SMT). Because such models can improve leveraging of syntactic information, two types of methods (leveraging source parsing and leveraging shallow parsing) are applied to introduce syntactic constraints into translation models. In this paper, we propose a bilingually-constrained recursive neural network (...
متن کاملReordering Models for Statistical Machine Translation: A Literature Survey
In this survey, we briefly study various reordering models that are used with statistical translation models. Reordering model is one of the important component of any statistical machine translation system. Problem of reordering is NP-Hard itself. In this survey, we study various reordering approaches that can be used to solve this problem. We first study simple distortion-based reordering whi...
متن کاملStatistical Machine Translation Using Coercive Two-Level Syntactic Transduction
We define, implement and evaluate a novel model for statistical machine translation, which is based on shallow syntactic analysis (part-of-speech tagging and phrase chunking) in both the source and target languages. It is able to model long-distance constituent motion and other syntactic phenomena without requiring a full parse in either language. We also examine aspects of lexical transfer, su...
متن کاملA Hybrid Machine Translation System for Typologically Related Languages
This paper describes a shallow parsing formalism aiming at machine translation between closely related languages. The formalism allows to write grammar rules helping to (partially) disambiguate chunks in input sentences. The chunks are then translatred into the target language without any deep syntactic or semantic processing. A stochastic ranker then selects the best translation according to t...
متن کامل